Abstract: Recent years have seen the ability to gather massive amounts of data in a large number of domains. As data is collected at an unprecedented rate, the analysis of this data, rather than its storage, becomes the challenge. According to an IDC estimate, 90% of data is unstructured, and it is the fastest-growing kind of data; the remainder is structured. Unstructured data refers to information that either does not have a predefined data model or does not fit into a relational database for information access. It arrives continuously from various sources such as satellite images, sensor readings, email messages, social media, web logs, survey results, audio, and video. Because of this large volume, extracting meaningful value from unstructured data is currently a major challenge across industries. Traditional methods are adequate for analysing structured data, but they are not appropriate for extracting knowledge from large volumes of unstructured data.

This paper presents a summary of unstructured data analysis for beginners and for people in academia who are interested in analysing unstructured data to extract knowledge that can improve business processes and performance.

 

Keywords: Unstructured data, structured data, data mining